Grid RPC meets Data Grid: Network Enabled Services for Data Farming on the Grid

Author

  • Satoshi Matsuoka

Abstract

The Computational Grid[1] is a promising platform for running large-scale scientific applications. It provides a base software infrastructure on which middleware for deploying applications on Grid resources can be developed. The question is how to program it. In this regard, the Network-Enabled Server (NES) paradigm, which enables Grid-based RPC (GridRPC for short), is a good candidate for viable Grid middleware: it offers a simple yet powerful model for programming on the Grid. Several systems that implement all or part of the paradigm already exist, such as Neos[7], Netsolve[3], Nimrod/G[4], Ninf[2], and RCS[6], and we feel that pursuing a common design for GridRPC, as was done with MPI for message passing, will bring the benefits of a standardized programming model to the Grid world. This talk will introduce the NES/GridRPC features, discuss early user experiences, and touch upon the Grid Data Farm project, which is based on GridRPC and involves processing petabytes of collider accelerator data streamed over the Euro-Japanese link, using clusters on the scale of thousands of nodes, possibly spread over several Japanese institutions.

Compared to traditional RPC systems such as CORBA, which are designed for nonscientific applications, GridRPC systems offer features and capabilities that make it easy to program medium- to coarse-grained, task-parallel applications involving hundreds to thousands or more high-performance nodes, either concentrated in a tightly coupled cluster or spread as a set of clusters over a wide-area network. Such applications often need to ship megabytes of multidimensional array data in a user-transparent and efficient way, and need support for RPC calls whose duration ranges anywhere from hundreds of milliseconds to several days or even weeks.

GridRPC systems need other features as well, such as dynamic resource discovery, dynamic load balancing, fault tolerance, security (multisite authentication, delegation of authentication, adapting to multiple security policies, etc.), easy-to-use client/server management, firewall and private address considerations, and remote large file and I/O support. These features are essentially what GridRPC systems need in order to execute well on the Grid. They are either missing or incomplete in traditional "closed world" RPC systems, and are in fact what lower-level Grid substrates such as Condor[10], Globus[8], and Legion[9] provide. As such, GridRPC systems either provide these features themselves or build upon the features provided by such substrates.
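
The coarse-grained, task-parallel calling pattern described above can be illustrated with a small local sketch. It is not the GridRPC API or the interface of any of the systems named in the abstract: `remote_sum_of_squares` and `farm` are hypothetical names, the "remote" routine runs in a local thread pool, and the retry loop is only a crude stand-in for the fault tolerance a real system would provide.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def remote_sum_of_squares(chunk):
    """Hypothetical stand-in for a remote library routine; runs locally here."""
    return sum(x * x for x in chunk)


def farm(tasks, func, max_workers=4, max_retries=2):
    """Issue one asynchronous call per task and return results in task order.

    A call that raises is resubmitted up to max_retries times (an assumed
    policy for illustration, not one taken from the abstract).
    """
    results = [None] * len(tasks)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each in-flight call to (task index, attempts made so far).
        pending = {pool.submit(func, t): (i, 0) for i, t in enumerate(tasks)}
        while pending:
            # Iterate over a snapshot, since retries add new entries.
            for future in as_completed(list(pending)):
                index, attempts = pending.pop(future)
                try:
                    results[index] = future.result()
                except Exception:
                    if attempts >= max_retries:
                        raise
                    retry = pool.submit(func, tasks[index])
                    pending[retry] = (index, attempts + 1)
    return results


if __name__ == "__main__":
    data = list(range(1000))
    # Split the input into coarse-grained chunks, one call per chunk.
    chunks = [data[i:i + 250] for i in range(0, len(data), 250)]
    partials = farm(chunks, remote_sum_of_squares)
    print(sum(partials))
```

In a real deployment each call would ship its array argument to a server chosen by resource discovery and could run for hours or days; the client-side structure (submit many calls, wait for all, resubmit failures) is the part this sketch is meant to show.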

Publication date: 2001